Search | VHL Regional Portal

1.

Classifying early infant feeding status from clinical notes using natural language processing and machine learning.

Lemas, Dominick J; Du, Xinsong; Rouhizadeh, Masoud; Lewis, Braeden; Frank, Simon; Wright, Lauren; Spirache, Alex; Gonzalez, Lisa; Cheves, Ryan; Magalhães, Marina; Zapata, Ruben; Reddy, Rahul; Xu, Ke; Parker, Leslie; Harle, Chris; Young, Bridget; Louis-Jaques, Adetola; Zhang, Bouri; Thompson, Lindsay; Hogan, William R; Modave, François.

Sci Rep ; 14(1): 7831, 2024 04 03.

Article in English | MEDLINE | ID: mdl-38570569

ABSTRACT

The objective of this study is to develop and evaluate natural language processing (NLP) and machine learning models to predict infant feeding status from clinical notes in the Epic electronic health records system. The primary outcome was the classification of infant feeding status from clinical notes using Medical Subject Headings (MeSH) terms. Annotation of notes was completed using TeamTat to uniquely classify clinical notes according to infant feeding status. We trained 6 machine learning models to classify infant feeding status: logistic regression, random forest, XGBoost gradient descent, k-nearest neighbors, and support-vector classifier. Model comparison was evaluated based on overall accuracy, precision, recall, and F1 score. Our modeling corpus included an even number of clinical notes that was a balanced sample across each class. We manually reviewed 999 notes that represented 746 mother-infant dyads with a mean gestational age of 38.9 weeks and a mean maternal age of 26.6 years. The most frequent feeding status classification present for this study was exclusive breastfeeding [n = 183 (18.3%)], followed by exclusive formula bottle feeding [n = 146 (14.6%)], and exclusive feeding of expressed mother's milk [n = 102 (10.2%)], with mixed feeding being the least frequent [n = 23 (2.3%)]. Our final analysis evaluated the classification of clinical notes as breast, formula/bottle, and missing. The machine learning models were trained on these three classes after performing balancing and down sampling. The XGBoost model outperformed all others by achieving an accuracy of 90.1%, a macro-averaged precision of 90.3%, a macro-averaged recall of 90.1%, and a macro-averaged F1 score of 90.1%. Our results demonstrate that natural language processing can be applied to clinical notes stored in the electronic health records to classify infant feeding status. 
Early identification of breastfeeding status using NLP on unstructured electronic health records data can be used to inform precision public health interventions focused on improving lactation support for postpartum patients.
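
The macro-averaged precision, recall, and F1 score reported in the abstract above give each class equal weight regardless of its size. The following is a minimal sketch of that calculation, not the study's code: the function name `macro_scores` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1), one equal vote per class."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in classes:
        prec = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical toy labels using the three final classes named in the abstract.
y_true = ["breast", "breast", "formula", "formula", "missing", "missing"]
y_pred = ["breast", "formula", "formula", "formula", "missing", "breast"]
precision, recall, f1 = macro_scores(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Macro averaging is a natural fit when the classes have been balanced by down sampling, as the abstract describes, because no single class dominates the summary.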

Subjects

Machine Learning , Natural Language Processing , Female , Humans , Infant , Software , Electronic Health Records , Mothers

2.

Identifying social determinants of health from clinical narratives: A study of performance, documentation ratio, and potential bias.

Yu, Zehao; Peng, Cheng; Yang, Xi; Dang, Chong; Adekkanattu, Prakash; Gopal Patra, Braja; Peng, Yifan; Pathak, Jyotishman; Wilson, Debbie L; Chang, Ching-Yuan; Lo-Ciganic, Wei-Hsuan; George, Thomas J; Hogan, William R; Guo, Yi; Bian, Jiang; Wu, Yonghui.

J Biomed Inform ; 153: 104642, 2024 Apr 14.

Article in English | MEDLINE | ID: mdl-38621641

ABSTRACT

OBJECTIVE: To develop a natural language processing (NLP) package to extract social determinants of health (SDoH) from clinical narratives, examine the bias among race and gender groups, test the generalizability of extracting SDoH for different disease groups, and examine population-level extraction ratio. METHODS: We developed SDoH corpora using clinical notes identified at the University of Florida (UF) Health. We systematically compared 7 transformer-based large language models (LLMs) and developed an open-source package - SODA (i.e., SOcial DeterminAnts) to facilitate SDoH extraction from clinical narratives. We examined the performance and potential bias of SODA for different race and gender groups, tested the generalizability of SODA using two disease domains including cancer and opioid use, and explored strategies for improvement. We applied SODA to extract 19 categories of SDoH from the breast (n = 7,971), lung (n = 11,804), and colorectal cancer (n = 6,240) cohorts to assess patient-level extraction ratio and examine the differences among race and gender groups. RESULTS: We developed an SDoH corpus using 629 clinical notes of cancer patients with annotations of 13,193 SDoH concepts/attributes from 19 categories of SDoH, and another cross-disease validation corpus using 200 notes from opioid use patients with 4,342 SDoH concepts/attributes. We compared 7 transformer models and the GatorTron model achieved the best mean average strict/lenient F1 scores of 0.9122 and 0.9367 for SDoH concept extraction and 0.9584 and 0.9593 for linking attributes to SDoH concepts. There is a small performance gap (~4%) between Males and Females, but a large performance gap (>16%) among race groups. The performance dropped when we applied the cancer SDoH model to the opioid cohort; fine-tuning using a smaller opioid SDoH corpus improved the performance. 
The extraction ratio varied in the three cancer cohorts, in which 10 SDoH could be extracted from over 70 % of cancer patients, but 9 SDoH could be extracted from less than 70 % of cancer patients. Individuals from the White and Black groups have a higher extraction ratio than other minority race groups. CONCLUSIONS: Our SODA package achieved good performance in extracting 19 categories of SDoH from clinical narratives. The SODA package with pre-trained transformer models is available at https://github.com/uf-hobi-informatics-lab/SODA_Docker.

3.

Initial Antihypertensive Prescribing in Relation to Blood Pressure Among Florida Medicaid and Medicare Recipients in the OneFlorida+ Research Consortium.

Smith, Kayla M; Keshwani, Shailina; Walsh, Marta G; Winterstein, Almut G; Gurka, Matthew J; Libby, Anne; Hogan, William R; Pepine, Carl J; Cooper-DeHoff, Rhonda M; Smith, Steven M.

Hypertension ; 81(2): e7-e9, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38232142

Subjects

Antihypertensive Agents , Medicaid , Antihypertensive Agents/therapeutic use , Blood Pressure , Florida/epidemiology , Medicare , United States/epidemiology , Humans , Aged

4.

Characteristics and Predictors of Apparent Treatment-Resistant Hypertension in Real-World Populations Using Electronic Health Record-Based Data.

Jafari, Eissa; Cooper-DeHoff, Rhonda M; Effron, Mark B; Hogan, William R; McDonough, Caitrin W.

Am J Hypertens ; 37(1): 60-68, 2024 Jan 01.

Article in English | MEDLINE | ID: mdl-37712350

ABSTRACT

BACKGROUND: Apparent treatment-resistant hypertension (aTRH) is defined as uncontrolled blood pressure (BP) despite using ≥3 antihypertensive classes or controlled BP while using ≥4 antihypertensive classes. Patients with aTRH have a higher risk for adverse cardiovascular outcomes compared with patients with controlled hypertension (HTN). Although there have been prior reports on the prevalence, characteristics, and predictors of aTRH, these have been broadly derived from smaller datasets, randomized controlled trials, or closed healthcare systems. METHODS: We extracted patients with HTN defined by ICD-9 and ICD-10 codes during 1/1/2015-12/31/2018, from 2 large electronic health record databases: the OneFlorida Data Trust (n = 223,384) and Research Action for Health Network (REACHnet) (n = 175,229). We applied our previously validated aTRH and stable controlled HTN computable phenotype algorithms and performed univariate and multivariate analyses to identify the prevalence, characteristics, and predictors of aTRH in these populations. RESULTS: The prevalence of aTRH among patients with HTN in OneFlorida (16.7%) and REACHnet (11.3%) was similar to prior reports. Both populations had a significantly higher proportion of Black patients with aTRH compared with those with stable controlled HTN. aTRH in both populations shared similar significant predictors, including Black race, diabetes, heart failure, chronic kidney disease, cardiomegaly, and higher body mass index. In both populations, aTRH was significantly associated with similar comorbidities, when compared with stable controlled HTN. CONCLUSIONS: In 2 large, diverse real-world populations, we observed similar comorbidities and predictors of aTRH as prior studies. In the future, these results may be used to improve healthcare professionals' understanding of aTRH predictors and associated comorbidities.
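
The aTRH definition stated at the start of the abstract above can be written as a two-branch rule. The sketch below encodes only that sentence. It is not the authors' validated computable phenotype, which is not described here. The function name `is_atrh` and its parameters are hypothetical, and whether BP counts as controlled is taken as an input because the abstract gives no threshold.

```python
def is_atrh(bp_controlled: bool, n_antihypertensive_classes: int) -> bool:
    """Apply the aTRH definition quoted in the abstract.

    aTRH = uncontrolled BP despite >=3 antihypertensive classes,
           or controlled BP while using >=4 antihypertensive classes.
    """
    if n_antihypertensive_classes < 0:
        raise ValueError("number of antihypertensive classes cannot be negative")
    if bp_controlled:
        return n_antihypertensive_classes >= 4
    return n_antihypertensive_classes >= 3


# Hypothetical patients: (BP controlled?, number of antihypertensive classes)
for controlled, n_classes in [(False, 3), (True, 3), (True, 4), (False, 2)]:
    print(controlled, n_classes, is_atrh(controlled, n_classes))
```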

Subjects

Antihypertensive Agents , Hypertension , Humans , Antihypertensive Agents/therapeutic use , Antihypertensive Agents/pharmacology , Electronic Health Records , Risk Factors , Hypertension/diagnosis , Hypertension/drug therapy , Hypertension/epidemiology , Blood Pressure , Prevalence

5.

Avenues for Strengthening PCORnet's Capacity to Advance Patient-Centered Economic Outcomes in Patient-Centered Outcomes Research (PCOR).

Waitman, Lemuel R; Bailey, Leonard Charles; Becich, Michael J; Chung-Bridges, Katherine; Dusetzina, Stacie B; Espino, Jessi U; Hogan, William R; Kaushal, Rainu; McClay, James C; Merritt, James Greg; Rothman, Russell L; Shenkman, Elizabeth A; Song, Xing; Nauman, Elizabeth.

Med Care ; 61(12 Suppl 2): S153-S160, 2023 12 01.

Article in English | MEDLINE | ID: mdl-37963035

ABSTRACT

PCORnet, the National Patient-Centered Clinical Research Network, provides the ability to conduct prospective and observational pragmatic research by leveraging standardized, curated electronic health records data together with patient and stakeholder engagement. PCORnet is funded by the Patient-Centered Outcomes Research Institute (PCORI) and is composed of 8 Clinical Research Networks that incorporate a total of 79 health system "sites." As the network developed, linkage to commercial health plans, federal insurance claims, disease registries, and other data resources demonstrated the value in extending the network's infrastructure to provide a more complete representation of patients' health and lived experiences. Initially, PCORnet studies avoided direct economic comparative effectiveness as a topic. However, PCORI's authorizing law was amended in 2019 to allow studies to incorporate patient-centered economic outcomes in primary research aims. With PCORI's expanded scope and PCORnet's phase 3 beginning in January 2022, there are opportunities to strengthen the network's ability to support economic patient-centered outcomes research. This commentary will discuss approaches that have been incorporated to date by the network and point to opportunities for the network to incorporate economic variables for analysis, informed by patient and stakeholder perspectives. Topics addressed include: (1) data linkage infrastructure; (2) commercial health plan partnerships; (3) Medicare and Medicaid linkage; (4) health system billing-based benchmarking; (5) area-level measures; (6) individual-level measures; (7) pharmacy benefits and retail pharmacy data; and (8) the importance of transparency and engagement while addressing the biases inherent in linking real-world data sources.

Subjects

Medicare , Patient Outcome Assessment , Aged , Humans , United States , Prospective Studies , Outcome Assessment, Health Care , Patient-Centered Care

6.

A study of generative large language model for medical research and healthcare.

Peng, Cheng; Yang, Xi; Chen, Aokun; Smith, Kaleb E; PourNejatian, Nima; Costa, Anthony B; Martin, Cheryl; Flores, Mona G; Zhang, Ying; Magoc, Tanja; Lipori, Gloria; Mitchell, Duane A; Ospina, Naykky S; Ahmed, Mustafa M; Hogan, William R; Shenkman, Elizabeth A; Guo, Yi; Bian, Jiang; Wu, Yonghui.

NPJ Digit Med ; 6(1): 210, 2023 Nov 16.

Article in English | MEDLINE | ID: mdl-37973919

ABSTRACT

There is enormous enthusiasm, as well as concern, about applying large language models (LLMs) to healthcare. Yet current assumptions are based on general-purpose LLMs such as ChatGPT, which are not developed for medical use. This study develops a generative clinical LLM, GatorTronGPT, using 277 billion words of text including (1) 82 billion words of clinical text from 126 clinical departments and approximately 2 million patients at the University of Florida Health and (2) 195 billion words of diverse general English text. We train GatorTronGPT using a GPT-3 architecture with up to 20 billion parameters and evaluate its utility for biomedical natural language processing (NLP) and healthcare text generation. GatorTronGPT improves biomedical natural language processing. We apply GatorTronGPT to generate 20 billion words of synthetic text. Synthetic NLP models trained using synthetic text generated by GatorTronGPT outperform models trained using real-world clinical text. Physicians' Turing test using 1 (worst) to 9 (best) scale shows that there are no significant differences in linguistic readability (p = 0.22; 6.57 of GatorTronGPT compared with 6.93 of human) and clinical relevance (p = 0.91; 7.0 of GatorTronGPT compared with 6.97 of human) and that physicians cannot differentiate them (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.

7.

Improving the Quality and Utility of Electronic Health Record Data through Ontologies.

Lin, Asiyah Yu; Arabandi, Sivaram; Beale, Thomas; Duncan, William D; Hicks, Amanda; Hogan, William R; Jensen, Mark; Koppel, Ross; Martínez-Costa, Catalina; Nytrø, Øystein; Obeid, Jihad S; de Oliveira, Jose Parente; Ruttenberg, Alan; Seppälä, Selja; Smith, Barry; Soergel, Dagobert; Zheng, Jie; Schulz, Stefan.

Standards (Basel) ; 3(3): 316-340, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37873508

ABSTRACT

The translational research community, in general, and the Clinical and Translational Science Awards (CTSA) community, in particular, share the vision of repurposing EHRs for research that will improve the quality of clinical practice. Many members of these communities are also aware that electronic health records (EHRs) suffer limitations of data becoming poorly structured, biased, and unusable out of original context. This creates obstacles to the continuity of care, utility, quality improvement, and translational research. Analogous limitations to sharing objective data in other areas of the natural sciences have been successfully overcome by developing and using common ontologies. This White Paper presents the authors' rationale for the use of ontologies with computable semantics for the improvement of clinical data quality and EHR usability formulated for researchers with a stake in clinical and translational science and who are advocates for the use of information technology in medicine but at the same time are concerned by current major shortfalls. This White Paper outlines pitfalls, opportunities, and solutions and recommends increased investment in research and development of ontologies with computable semantics for a new generation of EHRs.

8.

The role of health system penetration rate in estimating the prevalence of type 1 diabetes in children and adolescents using electronic health records.

Li, Piaopiao; Lyu, Tianchen; Alkhuzam, Khalid; Spector, Eliot; Donahoo, William T; Bost, Sarah; Wu, Yonghui; Hogan, William R; Prosperi, Mattia; Schatz, Desmond A; Atkinson, Mark A; Haller, Michael J; Shenkman, Elizabeth A; Guo, Yi; Bian, Jiang; Shao, Hui.

J Am Med Inform Assoc ; 31(1): 165-173, 2023 Dec 22.

Article in English | MEDLINE | ID: mdl-37812771

ABSTRACT

OBJECTIVE: Having sufficient population coverage from the electronic health records (EHRs)-connected health system is essential for building a comprehensive EHR-based diabetes surveillance system. This study aimed to establish an EHR-based type 1 diabetes (T1D) surveillance system for children and adolescents across racial and ethnic groups by identifying the minimum population coverage from EHR-connected health systems to accurately estimate T1D prevalence. MATERIALS AND METHODS: We conducted a retrospective, cross-sectional analysis involving children and adolescents <20 years old identified from the OneFlorida+ Clinical Research Network (2018-2020). T1D cases were identified using a previously validated computable phenotyping algorithm. The T1D prevalence for each ZIP Code Tabulation Area (ZCTA, 5 digits), defined as the number of T1D cases divided by the total number of residents in the corresponding ZCTA, was calculated. Population coverage for each ZCTA was measured using the observed health system penetration rate (HSPR), calculated as the ratio of residents in the corresponding ZCTA captured by OneFlorida+ to the overall population in the same ZCTA reported by the Census. We used a recursive partitioning algorithm to identify the minimum required observed HSPR to estimate T1D prevalence and compared our estimate with the reported T1D prevalence from the SEARCH study. RESULTS: Observed HSPRs of 55%, 55%, and 60% were identified as the minimum thresholds for the non-Hispanic White, non-Hispanic Black, and Hispanic populations. The estimated T1D prevalence for non-Hispanic White and non-Hispanic Black youth was 2.87 and 2.29 per 1000 youth, respectively, which is comparable to the reference study's estimation. The estimated prevalence of T1D for Hispanics (2.76 per 1000 youth) was higher than the reference study's estimation (1.48-1.64 per 1000 youth). The standardized T1D prevalence in the overall Florida population was 2.81 per 1000 youth in 2019. 
CONCLUSION: Our study provides a method to estimate T1D prevalence in children and adolescents using EHRs and reports the estimated HSPRs and prevalence of T1D for different race and ethnicity groups to facilitate EHR-based diabetes surveillance.
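
The two ratios defined in the methods above (per-ZCTA prevalence and observed HSPR) can be illustrated with a short sketch. This is not the study's pipeline: the ZCTA identifiers and counts are invented, the function names are hypothetical, and using the residents captured by the network as the prevalence denominator is an assumption. The 0.55 threshold is the value the abstract reports for the non-Hispanic White and non-Hispanic Black populations. The recursive partitioning step that derived it is not reproduced here.

```python
def observed_hspr(captured_residents: int, census_population: int) -> float:
    """Health system penetration rate: captured residents / Census population."""
    return captured_residents / census_population


def prevalence_per_1000(cases: int, residents: int) -> float:
    """T1D cases per 1000 residents."""
    return 1000 * cases / residents


def pooled_prevalence(zctas, min_hspr: float) -> float:
    """Pool cases and residents over ZCTAs whose observed HSPR meets the threshold."""
    kept = [z for z in zctas if observed_hspr(z["captured"], z["census"]) >= min_hspr]
    if not kept:
        raise ValueError("no ZCTA meets the minimum HSPR")
    cases = sum(z["cases"] for z in kept)
    # Assumption: the denominator is the residents captured by the network.
    residents = sum(z["captured"] for z in kept)
    return prevalence_per_1000(cases, residents)


# Invented toy data; ZCTA labels are placeholders, not real areas.
toy_zctas = [
    {"zcta": "A", "cases": 3, "captured": 1200, "census": 2000},  # HSPR 0.60
    {"zcta": "B", "cases": 1, "captured": 400, "census": 1000},   # HSPR 0.40
    {"zcta": "C", "cases": 2, "captured": 550, "census": 1000},   # HSPR 0.55
]
print(round(pooled_prevalence(toy_zctas, min_hspr=0.55), 2))
```

Filtering on HSPR before pooling reflects the abstract's premise that prevalence estimates are only reliable where the health system covers enough of the local population.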

Subjects

Diabetes Mellitus, Type 1 , Child , Humans , Adolescent , Young Adult , Adult , Diabetes Mellitus, Type 1/epidemiology , Prevalence , Electronic Health Records , Cross-Sectional Studies , Retrospective Studies

9.

An Automated Workflow Composition System for Liquid Chromatography-Mass Spectrometry Metabolomics Data Processing.

Du, Xinsong; Dastmalchi, Farhad; Diller, Matthew A; Brochhausen, Mathias; Garrett, Timothy J; Hogan, William R; Lemas, Dominick J.

J Am Soc Mass Spectrom ; 34(12): 2857-2863, 2023 Dec 06.

Article in English | MEDLINE | ID: mdl-37874901

ABSTRACT

Liquid chromatography-mass spectrometry (LC-MS) metabolomics studies produce high-dimensional data that must be processed by a complex network of informatics tools to generate analysis-ready data sets. As the first computational step in metabolomics, data processing is increasingly a challenge because researchers must develop customized computational workflows that are applicable to LC-MS metabolomics analysis. Ontology-based automated workflow composition (AWC) systems provide a feasible approach for developing computational workflows that consume high-dimensional molecular data. We used the Automated Pipeline Explorer (APE) to create an AWC for LC-MS metabolomics data processing across three use cases. Our results show that APE predicted 145 data processing workflows across all three use cases. We identified six traditional workflows and six novel workflows. Through manual review, we found that one-third of novel workflows were executable whereby the data processing function could be completed without obtaining an error. When selecting the top six workflows from each use case, the computationally viable rate of our predicted workflows reached 45%. Collectively, our study demonstrates the feasibility of developing an AWC system for LC-MS metabolomics data processing.

Subjects

Hominidae , Software , Animals , Workflow , Metabolomics/methods , Mass Spectrometry , Chromatography, Liquid/methods

10.

Clinical concept and relation extraction using prompt-based machine reading comprehension.

Peng, Cheng; Yang, Xi; Yu, Zehao; Bian, Jiang; Hogan, William R; Wu, Yonghui.

J Am Med Inform Assoc ; 30(9): 1486-1493, 2023 08 18.

Article in English | MEDLINE | ID: mdl-37316988

ABSTRACT

OBJECTIVE: To develop a natural language processing system that solves both clinical concept extraction and relation extraction in a unified prompt-based machine reading comprehension (MRC) architecture with good generalizability for cross-institution applications. METHODS: We formulate both clinical concept extraction and relation extraction using a unified prompt-based MRC architecture and explore state-of-the-art transformer models. We compare our MRC models with existing deep learning models for concept extraction and end-to-end relation extraction using 2 benchmark datasets developed by the 2018 National NLP Clinical Challenges (n2c2) challenge (medications and adverse drug events) and the 2022 n2c2 challenge (relations of social determinants of health [SDoH]). We also evaluate the transfer learning ability of the proposed MRC models in a cross-institution setting. We perform error analyses and examine how different prompting strategies affect the performance of MRC models. RESULTS AND CONCLUSION: The proposed MRC models achieve state-of-the-art performance for clinical concept and relation extraction on the 2 benchmark datasets, outperforming previous non-MRC transformer models. GatorTron-MRC achieves the best strict and lenient F1-scores for concept extraction, outperforming previous deep learning models on the 2 datasets by 1%-3% and 0.7%-1.3%, respectively. For end-to-end relation extraction, GatorTron-MRC and BERT-MIMIC-MRC achieve the best F1-scores, outperforming previous deep learning models by 0.9%-2.4% and 10%-11%, respectively. For cross-institution evaluation, GatorTron-MRC outperforms traditional GatorTron by 6.4% and 16% for the 2 datasets, respectively. The proposed method is better at handling nested/overlapped concepts, extracting relations, and has good portability for cross-institute applications. Our clinical MRC package is publicly available at https://github.com/uf-hobi-informatics-lab/ClinicalTransformerMRC.

Subjects

Comprehension , Drug-Related Side Effects and Adverse Reactions , Humans , Natural Language Processing

11.

Characteristics and Predictors of Apparent Treatment Resistant Hypertension in Real-World Populations Using Electronic Health Record-Based Data.

Jafari, Eissa; Cooper-DeHoff, Rhonda M; Effron, Mark B; Hogan, William R; McDonough, Caitrin W.

medRxiv ; 2023 May 01.

Article in English | MEDLINE | ID: mdl-37205447

ABSTRACT

Background: Apparent treatment-resistant hypertension (aTRH) is defined as uncontrolled blood pressure (BP) despite using ≥3 antihypertensive classes or controlled BP while using ≥4 antihypertensive classes. Patients with aTRH have a higher risk for adverse cardiovascular outcomes compared to patients with controlled hypertension. Although there have been prior reports on the prevalence, characteristics, and predictors of aTRH, these have been broadly derived from smaller datasets, randomized controlled trials, or closed healthcare systems. Methods: We extracted patients with hypertension defined by ICD 9 and 10 codes during 1/1/2015-12/31/2018, from two large electronic health record databases: the OneFlorida Data Trust (n=223,384) and Research Action for Health Network (REACHnet) (n=175,229). We applied our previously validated aTRH and stable controlled hypertension (HTN) computable phenotype algorithms and performed univariate and multivariate analyses to identify the prevalence, characteristics, and predictors of aTRH in these real-world populations. Results: The prevalence of aTRH in OneFlorida (16.7%) and REACHnet (11.3%) was similar to prior reports. Both populations had a significantly higher proportion of black patients with aTRH compared to those with stable controlled HTN. aTRH in both populations shared similar significant predictors, including black race, diabetes, heart failure, chronic kidney disease, cardiomegaly, and higher body mass index. In both populations, aTRH was significantly associated with similar comorbidities, when compared with stable controlled HTN. Conclusion: In two large, diverse real-world populations, we observed similar comorbidities and predictors of aTRH as prior studies. In the future, these results may be used to improve healthcare professionals' understanding of aTRH predictors and associated comorbidities. 
Clinical Perspective: What Is New?: Prior studies of apparent treatment resistant hypertension have focused on cohorts from smaller datasets, randomized controlled trials, or closed healthcare systems. We used validated computable phenotype algorithms for apparent treatment resistant hypertension and stable controlled hypertension to identify the prevalence, characteristics, and predictors of apparent treatment resistant hypertension in two large, diverse real-world populations. What Are the Clinical Implications?: Large, diverse real-world populations showed a similar prevalence of aTRH, 16.7% in OneFlorida and 11.3% in REACHnet, compared to those observed from other cohorts. Patients classified as apparent treatment resistant hypertension were significantly older and had a higher prevalence of comorbid conditions such as diabetes, dyslipidemia, coronary artery disease, heart failure with preserved ejection fraction, and chronic kidney disease stages 1-3. Within diverse, real-world populations, the strongest predictors for apparent treatment resistant hypertension were black race, higher body mass index, heart failure, chronic kidney disease, and diabetes.

12.

Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software.

Du, Xinsong; Dastmalchi, Farhad; Ye, Hao; Garrett, Timothy J; Diller, Matthew A; Liu, Mei; Hogan, William R; Brochhausen, Mathias; Lemas, Dominick J.

Metabolomics ; 19(2): 11, 2023 02 06.

Article in English | MEDLINE | ID: mdl-36745241

ABSTRACT

BACKGROUND: Liquid chromatography-high resolution mass spectrometry (LC-HRMS) is a popular approach for metabolomics data acquisition and requires many data processing software tools. The FAIR Principles - Findability, Accessibility, Interoperability, and Reusability - were proposed to promote open science and reusable data management, and to maximize the benefit obtained from contemporary and formal scholarly digital publishing. More recently, the FAIR principles were extended to include Research Software (FAIR4RS). AIM OF REVIEW: This study facilitates open science in metabolomics by providing an implementation solution for adopting FAIR4RS in the LC-HRMS metabolomics data processing software. We believe our evaluation guidelines and results can help improve the FAIRness of research software. KEY SCIENTIFIC CONCEPTS OF REVIEW: We evaluated 124 LC-HRMS metabolomics data processing software obtained from a systematic review and selected 61 software for detailed evaluation using FAIR4RS-related criteria, which were extracted from the literature along with internal discussions. We assigned each criterion one or more FAIR4RS categories through discussion. The minimum, median, and maximum percentages of criteria fulfillment of software were 21.6%, 47.7%, and 71.8%. Statistical analysis revealed no significant improvement in FAIRness over time. We identified four criteria that covered multiple FAIR4RS categories but had low fulfillment percentages: (1) no software had semantic annotation of key information; (2) only 6.3% of evaluated software were registered to Zenodo and received DOIs; (3) only 14.5% of selected software had official software containerization or virtual machine; (4) only 16.7% of evaluated software had fully documented functions in code. According to the results, we discussed improvement strategies and future directions.

Subjects

Metabolomics , Software , Metabolomics/methods , Chromatography, Liquid/methods , Mass Spectrometry/methods , Data Management

13.

Overtriage, Undertriage, and Value of Care after Major Surgery: An Automated, Explainable Deep Learning-Enabled Classification System.

Loftus, Tyler J; Ruppert, Matthew M; Shickel, Benjamin; Ozrazgat-Baslanti, Tezcan; Balch, Jeremy A; Hu, Die; Javed, Adnan; Madbak, Firas; Skarupa, David J; Guirgis, Faheem; Efron, Philip A; Tighe, Patrick J; Hogan, William R; Rashidi, Parisa; Upchurch, Gilbert R; Bihorac, Azra.

J Am Coll Surg ; 236(2): 279-291, 2023 02 01.

Article in English | MEDLINE | ID: mdl-36648256

ABSTRACT

BACKGROUND: In single-institution studies, overtriaging low-risk postoperative patients to ICUs has been associated with a low value of care; undertriaging high-risk postoperative patients to general wards has been associated with increased mortality and morbidity. This study tested the reproducibility of an automated postoperative triage classification system for generating an actionable, explainable decision support system. STUDY DESIGN: This longitudinal cohort study included adults undergoing inpatient surgery at two university hospitals. Triage classifications were generated by an explainable deep learning model using preoperative and intraoperative electronic health record features. Nearest neighbor algorithms identified risk-matched controls. Primary outcomes were mortality, morbidity, and value of care (inverted risk-adjusted mortality/total direct costs). RESULTS: Among 4,669 ICU admissions, 237 (5.1%) were overtriaged. Compared with 1,021 control ward admissions, overtriaged admissions had similar outcomes but higher costs ($15.9K [interquartile range $9.8K to $22.3K] vs $10.7K [$7.0K to $17.6K], p < 0.001) and lower value of care (0.2 [0.1 to 0.3] vs 1.5 [0.9 to 2.2], p < 0.001). Among 8,594 ward admissions, 1,029 (12.0%) were undertriaged. Compared with 2,498 control ICU admissions, undertriaged admissions had longer hospital length-of-stays (6.4 [3.4 to 12.4] vs 5.4 [2.6 to 10.4] days, p < 0.001); greater incidence of hospital mortality (1.7% vs 0.7%, p = 0.03), cardiac arrest (1.4% vs 0.5%, p = 0.04), and persistent acute kidney injury without renal recovery (5.2% vs 2.8%, p = 0.002); similar costs ($21.8K [$13.3K to $34.9K] vs $21.9K [$13.1K to $36.3K]); and lower value of care (0.8 [0.5 to 1.3] vs 1.2 [0.7 to 2.0], p < 0.001). CONCLUSIONS: Overtriage was associated with low value of care; undertriage was associated with both low value of care and increased mortality and morbidity. 
The proposed framework for generating automated postoperative triage classifications is reproducible.

Subjects

Deep Learning , Adult , Humans , Longitudinal Studies , Reproducibility of Results , Triage , Cohort Studies , Retrospective Studies

14.

Postoperative Overtriage to an Intensive Care Unit Is Associated With Low Value of Care.

Loftus, Tyler J; Ruppert, Matthew M; Ozrazgat-Baslanti, Tezcan; Balch, Jeremy A; Shickel, Benjamin; Hu, Die; Efron, Philip A; Tighe, Patrick J; Hogan, William R; Rashidi, Parisa; Upchurch, Gilbert R; Bihorac, Azra.

Ann Surg ; 277(2): 179-185, 2023 02 01.

Article in English | MEDLINE | ID: mdl-35797553

ABSTRACT

OBJECTIVE: We test the hypothesis that for low-acuity surgical patients, postoperative intensive care unit (ICU) admission is associated with lower value of care compared with ward admission. BACKGROUND: Overtriaging low-acuity patients to ICU consumes valuable resources and may not confer better patient outcomes. Associations among postoperative overtriage, patient outcomes, costs, and value of care have not been previously reported. METHODS: In this longitudinal cohort study, postoperative ICU admissions were classified as overtriaged or appropriately triaged according to machine learning-based patient acuity assessments and requirements for immediate postoperative mechanical ventilation or vasopressor support. The nearest neighbors algorithm identified risk-matched control ward admissions. The primary outcome was value of care, calculated as inverse observed-to-expected mortality ratios divided by total costs. RESULTS: Acuity assessments had an area under the receiver operating characteristic curve of 0.92 in generating predictions for triage classifications. Of 8592 postoperative ICU admissions, 423 (4.9%) were overtriaged. These were matched with 2155 control ward admissions with similar comorbidities, incidence of emergent surgery, immediate postoperative vital signs, and do not resuscitate order placement and rescindment patterns. Compared with controls, overtriaged admissions did not have a lower incidence of any measured complications. Total costs for admission were $16.4K for overtriage and $15.9K for controls (P = 0.03). Value of care was lower for overtriaged admissions [2.9 (2.0-4.0)] compared with controls [24.2 (14.1-34.5), P < 0.001]. CONCLUSIONS: Low-acuity postoperative patients who were overtriaged to ICUs had increased total costs, no improvements in outcomes, and received low-value care.
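
The value-of-care outcome described in the methods above (inverse observed-to-expected mortality ratio divided by total costs) can be sketched as follows. This is only an arithmetic illustration: the function name, the inputs, and the choice of thousands of dollars as the cost unit are assumptions, and the abstract does not state the scaling used, so the outputs are not comparable with the values it reports.

```python
def value_of_care(observed_deaths: float, expected_deaths: float,
                  total_cost_thousands: float) -> float:
    """Inverse observed-to-expected (O/E) mortality ratio divided by total cost.

    The cost unit (thousands of dollars) is an assumption made for illustration.
    """
    if observed_deaths <= 0 or expected_deaths <= 0:
        raise ValueError("O/E ratio and its inverse need positive death counts")
    if total_cost_thousands <= 0:
        raise ValueError("total cost must be positive")
    oe_ratio = observed_deaths / expected_deaths
    return (1.0 / oe_ratio) / total_cost_thousands


# Hypothetical cohort: 2 observed deaths, 4 expected, $16K total cost.
print(value_of_care(observed_deaths=2, expected_deaths=4, total_cost_thousands=16.0))
```

Under this formulation, value rises when observed mortality is lower than expected and falls as costs rise, which matches the direction of the comparison in the abstract.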

Subjects

Hospitalization; Intensive Care Units; Humans; Longitudinal Studies; Retrospective Studies; Cohort Studies
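
The abstract above defines value of care as the inverse observed-to-expected mortality ratio divided by total costs. A minimal sketch of that arithmetic follows. The function name, argument names, and input numbers are hypothetical, and the abstract does not state the cost unit or any scaling used to obtain the reported values, so the output here is not comparable to the published figures.

```python
def value_of_care(observed_deaths, expected_deaths, total_cost):
    """Inverse observed-to-expected (O/E) mortality ratio divided by total cost.

    Assumption: costs are passed in whatever unit the analyst chooses;
    the abstract does not specify the unit or any scaling factor.
    """
    if observed_deaths <= 0 or total_cost <= 0:
        raise ValueError("observed_deaths and total_cost must be positive")
    inverse_oe_ratio = expected_deaths / observed_deaths  # E/O = 1 / (O/E)
    return inverse_oe_ratio / total_cost


# Hypothetical inputs, for illustration only (not taken from the study).
example = value_of_care(observed_deaths=5, expected_deaths=10.0, total_cost=16400.0)
print(example)
```

With this definition, fewer deaths than expected raise the value, and higher costs lower it.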

15.

Initial Antihypertensive Regimens in Newly Treated Patients: Real World Evidence From the OneFlorida+ Clinical Research Network.

Smith, Steven M; Winterstein, Almut G; Gurka, Matthew J; Walsh, Marta G; Keshwani, Shailina; Libby, Anne M; Hogan, William R; Pepine, Carl J; Cooper-DeHoff, Rhonda M.

J Am Heart Assoc ; 12(1): e026652, 2023 01 03.

Article in English | MEDLINE | ID: mdl-36565195

ABSTRACT

Background: Knowledge of real-world antihypertensive use is limited to prevalent hypertension, limiting our understanding of how treatment evolves and its contribution to persistently poor blood pressure control. We sought to characterize antihypertensive initiation among new users. Methods and Results: Using Medicaid and Medicare data from the OneFlorida+ Clinical Research Consortium, we identified new users of ≥1 first-line antihypertensives (angiotensin-converting enzyme inhibitor, calcium channel blocker, angiotensin receptor blocker, thiazide diuretic, or β-blocker) between 2013 and 2021 among adults with diagnosed hypertension, and no antihypertensive fill during the prior 12 months. We evaluated initial antihypertensive regimens by class and drug overall and across study years and examined variation in antihypertensive initiation across demographics (sex, race, and ethnicity) and comorbidity (chronic kidney disease, diabetes, and atherosclerotic cardiovascular disease). We identified 143 054 patients initiating 188 995 antihypertensives (75% monotherapy; 25% combination therapy), with mean age 59 years and 57% of whom were women. The most commonly initiated antihypertensive class overall was angiotensin-converting enzyme inhibitors (39%) followed by β-blockers (31%), calcium channel blockers (24%), thiazides (19%), and angiotensin receptor blockers (11%). With the exception of β-blockers, a single drug accounted for ≥75% of use of each class. β-blocker use decreased (35%-26%), and calcium channel blocker use increased (24%-28%) over the study period, while initiation of most other classes remained relatively stable. We also observed significant differences in antihypertensive selection across demographic and comorbidity strata.
Conclusions: These findings indicate that substantial variation exists in initial antihypertensive prescribing, and there remain significant gaps between current guideline recommendations and real-world implementation in early hypertension care.

Subjects

Antihypertensive Agents; Hypertension; Humans; Female; Aged; United States/epidemiology; Middle Aged; Male; Antihypertensive Agents/therapeutic use; Medicare; Angiotensin-Converting Enzyme Inhibitors/therapeutic use; Calcium Channel Blockers/therapeutic use; Hypertension/drug therapy; Hypertension/epidemiology; Adrenergic beta-Antagonists/therapeutic use; Angiotensin Receptor Antagonists/therapeutic use
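
The new-user definition in the abstract above (a fill between 2013 and 2021 with no antihypertensive fill during the prior 12 months) can be sketched as a simple washout check. This is an illustration under stated assumptions, not the authors' method: the function name is hypothetical, 12 months is approximated as 365 days, and enrollment or diagnosis requirements are not modeled.

```python
from datetime import date, timedelta

# Assumption: the "prior 12 months" washout is approximated as 365 days.
WASHOUT = timedelta(days=365)


def first_new_use(fill_dates, study_start, study_end):
    """Return the earliest fill inside the study window that has no
    fill in the preceding washout period, or None if there is none.

    Assumption: a patient with no earlier recorded fill counts as washed
    out; continuous-enrollment checks are deliberately left out.
    """
    previous = None
    for current in sorted(fill_dates):
        in_window = study_start <= current <= study_end
        washed_out = previous is None or (current - previous) > WASHOUT
        if in_window and washed_out:
            return current
        previous = current
    return None


# Hypothetical fill history for one patient.
fills = [date(2012, 11, 1), date(2013, 2, 1), date(2015, 6, 1)]
index_date = first_new_use(fills, date(2013, 1, 1), date(2021, 12, 31))
print(index_date)
```

In this hypothetical history, the February 2013 fill is excluded because a fill occurred about three months earlier, so the first qualifying initiation is the June 2015 fill.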

16.

Ideal algorithms in healthcare: Explainable, dynamic, precise, autonomous, fair, and reproducible.

Loftus, Tyler J; Tighe, Patrick J; Ozrazgat-Baslanti, Tezcan; Davis, John P; Ruppert, Matthew M; Ren, Yuanfang; Shickel, Benjamin; Kamaleswaran, Rishikesan; Hogan, William R; Moorman, J Randall; Upchurch, Gilbert R; Rashidi, Parisa; Bihorac, Azra.

PLOS Digit Health ; 1(1), 2022.

Article in English | MEDLINE | ID: mdl-36532301

ABSTRACT

Established guidelines describe minimum requirements for reporting algorithms in healthcare; it is equally important to objectify the characteristics of ideal algorithms that confer maximum potential benefits to patients, clinicians, and investigators. We propose a framework for ideal algorithms, including 6 desiderata: explainable (convey the relative importance of features in determining outputs), dynamic (capture temporal changes in physiologic signals and clinical events), precise (use high-resolution, multimodal data and aptly complex architecture), autonomous (learn with minimal supervision and execute without human input), fair (evaluate and mitigate implicit bias and social inequity), and reproducible (validated externally and prospectively and shared with academic communities). We present an ideal algorithms checklist and apply it to highly cited algorithms. Strategies and tools such as the predictive, descriptive, relevant (PDR) framework, the Standard Protocol Items: Recommendations for Interventional Trials-Artificial Intelligence (SPIRIT-AI) extension, sparse regression methods, and minimizing concept drift can help healthcare algorithms achieve these objectives, toward ideal algorithms in healthcare.

17.

A large language model for electronic health records.

Yang, Xi; Chen, Aokun; PourNejatian, Nima; Shin, Hoo Chang; Smith, Kaleb E; Parisien, Christopher; Compas, Colin; Martin, Cheryl; Costa, Anthony B; Flores, Mona G; Zhang, Ying; Magoc, Tanja; Harle, Christopher A; Lipori, Gloria; Mitchell, Duane A; Hogan, William R; Shenkman, Elizabeth A; Bian, Jiang; Wu, Yonghui.

NPJ Digit Med ; 5(1): 194, 2022 Dec 26.

Article in English | MEDLINE | ID: mdl-36572766

ABSTRACT

There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, there are few clinical language models, the largest of which trained in the clinical domain is comparatively small at 110 million parameters (compared with billions of parameters in the general domain). It is not clear how large clinical language models with billions of parameters can help medical AI systems utilize unstructured EHRs. In this study, we develop from scratch a large clinical language model, GatorTron, using >90 billion words of text (including >82 billion words of de-identified clinical text) and systematically evaluate it on five clinical NLP tasks including clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA). We examine how (1) scaling up the number of parameters and (2) scaling up the size of the training data could benefit these NLP tasks. GatorTron models scale up the clinical language model from 110 million to 8.9 billion parameters and improve five clinical NLP tasks (e.g., 9.6% and 9.5% improvement in accuracy for NLI and MQA), which can be applied to medical AI systems to improve healthcare delivery. The GatorTron models are publicly available at: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og

18.

Ontology Development Kit: a toolkit for building, maintaining and standardizing biomedical ontologies.

Matentzoglu, Nicolas; Goutte-Gattat, Damien; Tan, Shawn Zheng Kai; Balhoff, James P; Carbon, Seth; Caron, Anita R; Duncan, William D; Flack, Joe E; Haendel, Melissa; Harris, Nomi L; Hogan, William R; Hoyt, Charles Tapley; Jackson, Rebecca C; Kim, HyeongSik; Kir, Huseyin; Larralde, Martin; McMurry, Julie A; Overton, James A; Peters, Bjoern; Pilgrim, Clare; Stefancsik, Ray; Robb, Sofia Mc; Toro, Sabrina; Vasilevsky, Nicole A; Walls, Ramona; Mungall, Christopher J; Osumi-Sutherland, David.

Database (Oxford) ; 2022, 2022 10 08.

Article in English | MEDLINE | ID: mdl-36208225

ABSTRACT

Similar to managing software packages, managing the ontology life cycle involves multiple complex workflows such as preparing releases, continuous quality control checking and dependency management. To manage these processes, a diverse set of tools is required, from command-line utilities to powerful ontology-engineering environments. Particularly in the biomedical domain, which has developed a set of highly diverse yet inter-dependent ontologies, standardizing release practices and metadata and establishing shared quality standards are crucial to enable interoperability. The Ontology Development Kit (ODK) provides a set of standardized, customizable and automatically executable workflows, and packages all required tooling in a single Docker image. In this paper, we provide an overview of how the ODK works, show how it is used in practice and describe how we envision it driving standardization efforts in our community. Database URL: https://github.com/INCATools/ontology-development-kit

Subjects

Biological Ontologies; Databases, Factual; Metadata; Quality Control; Software; Workflow

19.

Phenotype clustering in health care: A narrative review for clinicians.

Loftus, Tyler J; Shickel, Benjamin; Balch, Jeremy A; Tighe, Patrick J; Abbott, Kenneth L; Fazzone, Brian; Anderson, Erik M; Rozowsky, Jared; Ozrazgat-Baslanti, Tezcan; Ren, Yuanfang; Berceli, Scott A; Hogan, William R; Efron, Philip A; Moorman, J Randall; Rashidi, Parisa; Upchurch, Gilbert R; Bihorac, Azra.

Front Artif Intell ; 5: 842306, 2022.

Article in English | MEDLINE | ID: mdl-36034597

ABSTRACT

Human pathophysiology is occasionally too complex for unaided hypothetical-deductive reasoning and the isolated application of additive or linear statistical methods. Clustering algorithms use input data patterns and distributions to form groups of similar patients or diseases that share distinct properties. Although clinicians frequently perform tasks that may be enhanced by clustering, few receive formal training and clinician-centered literature in clustering is sparse. To add value to clinical care and research, optimal clustering practices require a thorough understanding of how to process and optimize data, select features, weigh strengths and weaknesses of different clustering methods, select the optimal clustering method, and apply clustering methods to solve problems. These concepts and our suggestions for implementing them are described in this narrative review of published literature. All clustering methods share the weakness of finding potential clusters even when natural clusters do not exist, underscoring the importance of applying data-driven techniques as well as clinical and statistical expertise to clustering analyses. When applied properly, patient and disease phenotype clustering can reveal obscured associations that can help clinicians understand disease pathophysiology, predict treatment response, and identify patients for clinical trial enrollment.

20.

A Checklist for Reproducible Computational Analysis in Clinical Metabolomics Research.

Du, Xinsong; Aristizabal-Henao, Juan J; Garrett, Timothy J; Brochhausen, Mathias; Hogan, William R; Lemas, Dominick J.

Metabolites ; 12(1), 2022 Jan 17.

Article in English | MEDLINE | ID: mdl-35050209

ABSTRACT

Clinical metabolomics emerged as a novel approach for biomarker discovery with the translational potential to guide next-generation therapeutics and precision health interventions. However, reproducibility in clinical research employing metabolomics data is challenging. Checklists are a helpful tool for promoting reproducible research. Existing checklists that promote reproducible metabolomics research primarily focused on metadata and may not be sufficient to ensure reproducible metabolomics data processing. This paper provides a checklist including actions that need to be taken by researchers to make computational steps reproducible for clinical metabolomics studies. We developed an eight-item checklist that includes criteria related to reusable data sharing and reproducible computational workflow development. We also provided recommended tools and resources to complete each item, as well as a GitHub project template to guide the process. The checklist is concise and easy to follow. Studies that follow this checklist and use the recommended resources may help other researchers reproduce metabolomics results easily and efficiently.
